[Feature] : Add REST API endpoints for access through AI by pulk17 · Pull Request #1117 · CCExtractor/sample-platform

pulk17 · 2026-06-04T08:06:43Z

Please prefix your pull request with one of the following: [FEATURE] [FIX] [IMPROVEMENT].

In raising this pull request, I confirm the following (please check boxes):

I have read and understood the contributors guide.
I have checked that another pull request for this purpose does not exist.
I have considered, and confirmed that this submission will be valuable to others.
I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
I give this submission freely, and claim no ownership to its content.

My familiarity with the project is as follows (check one):

I have never used the project.
I have used the project briefly.
I have used the project extensively, but have not contributed previously.
I am an active contributor to the project.

Feature: JSON REST API for the CCExtractor CI Platform (`mod_api`)

Summary

This PR introduces a complete, spec-driven JSON REST API for the CCExtractor Sample Platform, mounted at /api/v1. It replaces the need for scrapers or direct database access by exposing every CI data point (runs, samples, results, diffs, logs, errors, artifacts, queue status) through authenticated, validated, and rate-limited endpoints.

The API is fully described by the openapi-ci-api.yaml specification and validated against it using Schemathesis property-based testing. It integrates cleanly with the existing data model (mod_test, mod_regression, mod_sample, mod_auth) without modifying any of those modules.

Motivation

The existing platform serves server-rendered HTML pages and lacks a programmatic interface. This makes it impossible for:

External dashboards and monitoring tools to query CI status
Automated pipelines to trigger test runs or approve baselines
Frontend rewrites (e.g., a React or Vue SPA) to consume structured data
Third-party integrations to pull regression test results

This API solves all of the above by providing a standard REST interface with proper authentication, authorization, pagination, filtering, and error handling.

Architecture

The module is organized into five layers, each with a single responsibility:

mod_api/
├── __init__.py          # Blueprint registration + import ordering
├── utils.py             # Pagination helpers, sort utilities
├── models/
│   └── api_token.py     # ApiToken SQLAlchemy model (argon2 hashed)
├── middleware/
│   ├── auth.py          # Bearer token auth + scope/role decorators
│   ├── error_handler.py # Centralized JSON error responses
│   ├── rate_limit.py    # Per-client sliding window rate limiter
│   ├── security.py      # HSTS, CSP, X-Frame-Options headers
│   └── validation.py    # Body, pagination, path ID, date range validators
├── schemas/
│   ├── common.py        # Base schema config
│   ├── auth.py          # Token create/list schemas
│   ├── errors.py        # Error item/summary schemas
│   ├── results.py       # Baseline approval request schema
│   ├── runs.py          # Run list/create/progress schemas
│   ├── samples.py       # Sample history entry schema
│   └── system.py        # Health/queue response schemas
├── services/
│   ├── status.py        # Run/sample status derivation (single source of truth)
│   ├── diff_service.py  # Structured diff computation (JSON hunks)
│   ├── error_service.py # Error classification from TestProgress/TestResult
│   ├── log_service.py   # Build log reader with cursor pagination
│   └── storage.py       # GCS signed URL + local fallback resolution
└── routes/
    ├── auth.py          # Token create, list, revoke
    ├── runs.py          # Run list, create, detail, summary, progress, config, cancel
    ├── samples.py       # Run samples, media catalog, sample history, regression tests
    ├── results.py       # Expected/actual output, structured diffs, baseline approval
    ├── errors_logs.py   # Test errors, infra errors, error summary, build logs
    └── system.py        # Health check, queue status, artifact listing

Why this structure matters

Middleware runs as before_request/after_request hooks, so route handlers never contain auth, validation, or rate-limit boilerplate
Services encapsulate all business logic — route handlers are thin orchestrators that call services and return formatted responses
Schemas (Marshmallow) handle both input validation (load) and output serialization (dump), ensuring the API contract matches the OpenAPI spec
Status derivation is centralized in services/status.py to prevent inconsistent status logic across endpoints

Endpoints

Authentication (`/auth/tokens`)

Method	Path	Auth	Description
`POST`	`/auth/tokens`	None (email/password in body)	Authenticate and issue a scoped Bearer token
`GET`	`/auth/tokens`	Bearer + `tokens:manage`	List your tokens (admins can pass `?all=true`)
`DELETE`	`/auth/tokens/current`	Bearer	Revoke the token in the current request
`DELETE`	`/auth/tokens/{id}`	Bearer	Revoke a token by ID (admins can revoke anyone's)

Runs (`/runs`)

Method	Path	Scope	Description
`GET`	`/runs`	`runs:read`	List runs with filters: `platform`, `branch`, `commit_sha`, `repository`, `status`, `created_after/before`, `sort`
`POST`	`/runs`	`runs:write`	Trigger a new test run for a commit + platform
`GET`	`/runs/{id}`	`runs:read`	Get a single run's details
`GET`	`/runs/{id}/summary`	`runs:read`	Pass/fail/skip/error counts
`GET`	`/runs/{id}/progress`	`runs:read`	Timeline of CI worker progress events
`GET`	`/runs/{id}/config`	`runs:read`	Run configuration and regression test matrix
`POST`	`/runs/{id}/cancel`	`runs:write`	Cancel a queued or running test (idempotent)

Samples (`/samples`, `/runs/{id}/samples`, `/regression-tests`)

Method	Path	Scope	Description
`GET`	`/runs/{id}/samples`	`runs:read`	Per-sample regression test results for a run
`GET`	`/runs/{id}/samples/{sid}`	`runs:read`	Single regression test result in a run
`GET`	`/samples`	`runs:read`	Media sample catalog with filters: `name`, `extension`, `tag`, `sha256`, `status`
`GET`	`/samples/{id}`	`runs:read`	Single media sample detail
`GET`	`/samples/{id}/history`	`runs:read`	Cross-run history with failure signatures
`GET`	`/regression-tests`	`runs:read`	Regression test definitions with `active`, `category`, `tag`, `sample_id` filters

Results (`/runs/{id}/samples/{sid}/regression-tests/{rid}/outputs/{oid}`)

Method	Path	Scope	Description
`GET`	`.../expected`	`results:read`	Expected output file (text or base64)
`GET`	`.../actual`	`results:read`	Actual output file (303 redirect if identical to expected)
`GET`	`.../diff`	`results:read`	Structured JSON diff or unified diff between expected and actual
`POST`	`/runs/{id}/samples/{sid}/baseline-approval`	`baselines:write` + admin/contributor	Approve actual output as the new expected baseline

Errors & Logs (`/runs/{id}/errors`, `/runs/{id}/logs`)

Method	Path	Scope	Description
`GET`	`/runs/{id}/errors`	`results:read`	Test-level errors for a run
`GET`	`/runs/{id}/infrastructure-errors`	`system:read`	Infra errors (VM, build, worker failures)
`GET`	`/runs/{id}/error-summary`	`results:read`	Grouped error counts by type/severity/sample
`GET`	`/runs/{id}/logs`	`system:read`	Build log with cursor-based pagination and `level`, `source`, `contains` filters
`GET`	`/runs/{id}/samples/{sid}/logs`	`system:read`	Reserved placeholder endpoint for per-sample logs (currently returns 404 as the CI worker doesn't support this yet)

System (`/system`)

Method	Path	Auth	Description
`GET`	`/system/health`	None	Health check with DB, local storage, and GCS dependency status
`GET`	`/system/queue`	`system:read`	Queue depth, running count, and job listing
`GET`	`/runs/{id}/artifacts`	`results:read`	List all artifacts for a run (GCS + local, with signed download URLs)

Security

Token Authentication

Tokens are prefixed with spci_ for easy identification in logs and secret scanners
Only the argon2id hash is stored in the database — the plaintext token is returned exactly once at creation time and never logged
Token lookup uses a prefix index for O(1) candidate narrowing, then verifies the full hash against each candidate
Expired and revoked tokens are rejected at the middleware layer before any route handler executes
Failed authentication always performs a dummy hash verification to prevent timing-based user enumeration

Scope-Based Authorization

Six scopes control access granularity:

Scope	Grants
`runs:read`	List/view runs, samples, regression tests
`runs:write`	Trigger new runs, cancel existing runs
`results:read`	View expected/actual output, diffs, errors
`baselines:write`	Approve new baselines (admin/contributor only)
`system:read`	View queue, infrastructure errors, build logs
`tokens:manage`	List and revoke tokens

Non-admin users cannot request baselines:write. The scope check returns 403 with details about which scopes are missing and which scopes the token holds, making debugging straightforward.

Role Enforcement

Certain operations additionally require specific user roles:

POST /runs/{id}/cancel — admin, contributor, or tester
POST /runs/{id}/samples/{sid}/baseline-approval — admin or contributor
POST /runs (Triggering runs):
- Scope: You ALWAYS need the runs:write scope, regardless of the repository.
- Role: To trigger runs for the main repository, you must be an admin, contributor, or tester. To trigger runs for your own fork repository, any authenticated user (who owns the fork and has runs:write scope) can do so.

Rate Limiting

Per-client sliding window rate limits protect against abuse:

Endpoint	Limit	Window	Key
`POST /auth/tokens`	5 requests	15 min	Client IP
Mutating endpoints (POST/DELETE/PUT/PATCH)	20 requests	1 min	Token ID
Read endpoints (GET)	120 requests	1 min	Token ID

Rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After) are included on every response. A background eviction process prunes stale entries every 100 requests to bound memory usage.

Caveat: Rate limit state is stored in-memory (thread-safe dict). This works perfectly for single-process/single-server deployments and multi-threaded Gunicorn. However, if the platform is ever scaled to multiple Gunicorn workers or multiple servers behind a load balancer, the rate limiter should be migrated to Redis so that counters are shared across processes. This is a known limitation documented in the code.

Rate limiting is automatically disabled when app.config['TESTING'] is True to prevent interference with test suites. In production, the TESTING flag is never set.

Security Headers

Every API response includes:

Strict-Transport-Security: max-age=31536000; includeSubDomains
Content-Security-Policy: default-src 'none'; frame-ancestors 'none'
X-Content-Type-Options: nosniff
X-Frame-Options: DENY

Path Traversal Protection

All file-serving endpoints (/expected, /actual, /diff) use _safe_resolve() which calls os.path.realpath() and verifies the resolved path is within the TestResults base directory. File extensions from the database are sanitized to remove /, \, and .. sequences before being appended to filenames.

Input Validation

All validation happens at the middleware layer through reusable decorators:

@validate_body(Schema) — Validates JSON body against a Marshmallow schema. Returns 415 for non-JSON content types, 400 for malformed JSON, and 400 with field-level error details for schema violations
@validate_offset_pagination() — Extracts and validates limit (1–100) and offset (≥0, ≤2147483647). Rejects requests that mix cursor and offset pagination
@validate_cursor_pagination() — Extracts and validates limit and cursor for log endpoints
@validate_path_id('param') — Ensures URL path parameters are positive integers within int32 range
@validate_date_range — Parses created_after/created_before ISO 8601 params and rejects inverted ranges
@validate_sort() — Validates the sort query parameter against a whitelist (created_at, -created_at, run_id, -run_id)
LIKE wildcard escaping — The GET /samples name filter escapes % and _ to prevent SQL LIKE injection

Status Derivation

Status logic is centralized in services/status.py to prevent inconsistencies:

Run Status

The existing test.failed property only checks for TestStatus.canceled and is not reliable for determining whether regression tests passed. Instead, we derive status from the latest TestProgress row and, for completed runs, count actual failures from TestResult rows:

Derived Status	Condition
`queued`	No TestProgress rows
`running`	Latest progress is `preparation` or `testing`
`canceled`	Latest progress is `canceled`
`pass`	Latest is `completed` + zero failures
`fail`	Latest is `completed` + one or more failures
`incomplete`	Any other raw status

Sample Status

Derived Status	Condition
`pass`	Exit code matches expected AND all outputs match
`fail`	Exit code mismatch OR any output has a diff
`missing_output`	Dummy sentinel row detected
`not_started`	No TestResult row exists

Dummy Row Detection

The CI worker writes a sentinel TestResultFile row with regression_test_output_id = -1 and got = 'error' when a test produces no output. This row is filtered from all API responses and drives the missing_output status.

Deployment prerequisite: Before deploying to production, run this query to verify no old-format sentinel rows exist that would be missed:
SELECT COUNT(*) FROM test_result_file
WHERE (test_id = -1 OR regression_test_id = -1)
  AND NOT (regression_test_output_id = -1 AND got = 'error');
If the count is > 0, a data migration is needed first.

Pagination

Two pagination strategies are used:

Offset Pagination (most endpoints)

{
  "data": [...],
  "pagination": {
    "limit": 50,
    "offset": 0,
    "total": 142,
    "next_offset": 50
  }
}

Cursor Pagination (build logs)

{
  "data": [...],
  "pagination": {
    "limit": 100,
    "next_cursor": 3847
  }
}

Cursor pagination is used for logs because they are read from files, not from the database, and offset-based access would require reading and discarding lines on every request.

Error Response Format

All errors follow a consistent structure:

{
  "code": "validation_error",
  "message": "Request failed schema validation.",
  "details": {
    "fields": {
      "email": ["Missing data for required field."],
      "password": ["Shorter than minimum length 1."]
    }
  }
}

HTTP exceptions (404, 405, 500) are caught and reformatted into this structure so that API clients never receive HTML error pages.

Performance Considerations

Batch Status Derivation

batch_get_run_data() in services/status.py preloads all TestProgress, TestResult, and TestResultFile rows for a batch of runs in three queries (instead of N+1), then computes statuses in-memory. This is critical for the GET /runs list endpoint.

Status Filtering Caveat

When GET /runs?status=running is requested, the API must load up to 1,000 runs and compute their derived status in-memory (because status is not a database column). A truncated: true flag is included in the pagination response when the 1,000-run cap is hit. This is documented in the response schema.

Result File Preloading

Endpoints that iterate over TestResult rows (run summary, run samples, sample history) use defaultdict-based preloading of TestResultFile rows to avoid N+1 queries.

Diff Size Limits

The diff endpoint rejects files larger than 10 MiB and directs clients to the download URL instead. Output files are truncated at 1 MiB with a truncated: true flag and a GCS download link for the full file.

Database Changes

New Table: `api_token`

Column	Type	Notes
`id`	`INT PRIMARY KEY`	Auto-increment
`user_id`	`INT FK → user.id`	CASCADE on delete/update
`token_name`	`VARCHAR(50)`	Unique per user (`uq_user_token_name`)
`token_hash`	`VARCHAR(255)`	Argon2id hash
`token_prefix`	`VARCHAR(16)`	First 16 chars, indexed for fast lookup
`scopes_json`	`TEXT`	JSON array of scope strings
`created_at`	`DATETIME(tz)`	UTC timestamp
`expires_at`	`DATETIME(tz)`	UTC timestamp
`revoked_at`	`DATETIME(tz)`	NULL until revoked

No existing tables are modified.

Test Coverage

Test Organization (265 tests total)

tests/api/
├── test_middleware_auth.py           (13 tests)  — Token validation, public endpoints, scope/role checks
├── test_middleware_error_handler.py  (9 tests)   — JSON error formatting, HTTP exception handling
├── test_middleware_rate_limit.py     (8 tests)   — Sliding window, eviction, header injection
├── test_middleware_validation.py     (13 tests)  — Body, pagination, path ID, date range, sort validation
├── test_models_api_token.py          (3 tests)   — Token generation, hashing, expiry, revocation
├── test_routes_auth.py               (17 tests)  — Token lifecycle: create, list, revoke, edge cases
├── test_routes_errors_logs.py        (12 tests)  — Error listing, infra errors, error summary, log reading
├── test_routes_results.py            (12 tests)  — Expected/actual output, diff, baseline approval
├── test_routes_runs.py               (19 tests)  — Run CRUD, filtering, sorting, cancellation
├── test_routes_samples.py            (10 tests)  — Sample catalog, history, regression test listing
├── test_routes_system.py             (8 tests)   — Health check, queue status, artifact listing
├── test_schemathesis.py              (79 tests)  — OpenAPI contract fuzzing against every endpoint
├── test_services_diff_service.py     (10 tests)  — Structured diff, hunk parsing, edge cases
├── test_services_error_service.py    (8 tests)   — Error classification from TestProgress/TestResult
├── test_services_log_service.py      (10 tests)  — Log reading, cursor pagination, filtering
├── test_services_status.py           (14 tests)  — Run/sample status derivation, batch computation
├── test_services_storage.py          (8 tests)   — GCS signed URLs, local fallback, path resolution
└── test_utils.py                     (12 tests)  — Pagination helpers, sort column resolution

Schemathesis (Property-Based Testing)

The test_schemathesis.py module validates every endpoint against the OpenAPI specification using Schemathesis, which generates hundreds of edge-case inputs per endpoint. It covers:

Broad schema fuzzing across all endpoints
Per-endpoint targeted validation (runs, samples, auth, results, diffs)
Negative security testing (invalid auth, missing headers)
Response invariant checks
Boundary and edge-case coverage

Hypothesis is configured with max_examples=5 and deadline=None to keep CI runtime reasonable (~7 minutes locally) while still providing meaningful coverage. Rate limiting is bypassed during testing via app.config['TESTING'].

What the tests mock

config_parser.parse_config → returns an in-memory SQLite config
google.cloud.storage.Client.from_service_account_json → returns a mock GCS client
Individual service functions are mocked at the route level to isolate handler logic from DB state

New Files

Path	Lines	Purpose
`mod_api/__init__.py`	28	Blueprint setup, middleware/route registration
`mod_api/utils.py`	73	Pagination and sort helpers
`mod_api/models/api_token.py`	142	ApiToken model with argon2 hashing
`mod_api/middleware/auth.py`	132	Bearer auth + scope/role decorators
`mod_api/middleware/error_handler.py`	~110	Centralized error response formatting
`mod_api/middleware/rate_limit.py`	~130	Sliding window rate limiter
`mod_api/middleware/security.py`	12	Security response headers
`mod_api/middleware/validation.py`	321	Input validation decorators
`mod_api/schemas/*.py`	~400	Marshmallow schemas for all endpoints
`mod_api/services/status.py`	211	Status derivation logic
`mod_api/services/diff_service.py`	202	Structured diff computation
`mod_api/services/error_service.py`	~230	Error classification
`mod_api/services/log_service.py`	~100	Build log reader
`mod_api/services/storage.py`	65	GCS + local artifact resolution
`mod_api/routes/*.py`	~1,900	All endpoint handlers
`tests/api/*.py`	~3,000	265 tests across 19 test files
`openapi-ci-api.yaml`	~2,700	OpenAPI 3.0 specification

Total new code: ~5,500 lines (source) + ~3,000 lines (tests) + ~2,700 lines (OpenAPI spec)

Dependencies

New pip dependencies:

argon2-cffi — Token hashing (argon2id)
marshmallow — Schema validation and serialization
schemathesis — OpenAPI contract testing (test dependency only)
hypothesis — Property-based testing engine (test dependency only, pulled in by schemathesis)

All other dependencies (flask, sqlalchemy, passlib, google-cloud-storage) were already in requirements.txt.

How to Test Locally

# Activate virtual environment
source .venv/bin/activate  # or .\.venv\Scripts\activate on Windows

# Run all API tests
pytest tests/api/ -v

# Run only unit tests (skip schemathesis)
pytest tests/api/ -v --ignore=tests/api/test_schemathesis.py

# Run linters against API code
python run_checks.py mod_api tests/api --skip-nose

Known Limitations & Future Work

Rate limit storage: In-memory only (see Security section above). Migrate to Redis if scaling to multiple workers.
Status filtering: GET /runs?status=X requires in-memory filtering with a 1,000-run cap. A future improvement would be to materialize run status as a database column updated by a trigger or the CI worker.
Per-sample logs: GET /runs/{id}/samples/{sid}/logs returns 404 because the CI worker doesn't produce per-sample log files. The endpoint is implemented and documented as a placeholder for when the worker adds this capability.
Dummy row detection: See the deployment prerequisite SQL query in the Status Derivation section above.
GCS blob existence: The artifact endpoint does not call blob.exists() before generating signed URLs (to avoid latency). Clients should handle 404s from the signed URL gracefully.

sonarqubecloud · 2026-06-21T07:48:25Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.5% Duplication on New Code

See analysis details on SonarQube Cloud

cfsmp3 · 2026-06-23T19:55:46Z

While @canihavesomecoffee has a bit of time to review, some notes:

This PR is huge. It should be split into easy to review and merge ones - start with one that does the basic scaffolding and then ones that add the different endpoints (in a way that they can be reviewed and merged in order).
Other than that it well organized so splitting it should be easy since not a lot of refactoring will be needed (or maybe nothing at all)

I asked claude to review it. Take a look:

CRITICAL

C1 — Missing-output detection is broken; failing runs report pass. (verified)
services/status.py:71 is_dummy_row() keys on a TestResultFile row with regression_test_output_id == -1, got == 'error'. That row is never written to the DB. The only persisted TestResultFile rows (mod_ci/controllers.py:2588,2670) use a real rto.id. The TestResultFile(-1,-1,-1,'',got) at mod_test/controllers.py:118 is an in-memory object for HTML rendering only — never db.add-ed. The real "missing output" condition is absence of TestResultFile rows while the regression test has a non-ignored RegressionTestOutput (see reference logic at mod_test/controllers.py:107-118).

Consequence in derive_sample_status (status.py:57-69): empty result_files → both loops skip → if exit code matches → returns pass. Cascades into run status, /summary counts, /errors, /error-summary. The whole point of this platform is catching regressions; this silently passes tests that produced no output. The "DEPLOYMENT PREREQUISITE" SQL in the docstring is also misleading — it checks for old rows that were likewise never persisted. Must rewrite missing-output derivation against expected RegressionTestOutputs.

HIGH

H1 — Privilege escalation: any user can revoke anyone's tokens. (verified) routes/auth.py:64 puts tokens:manage in allowed_scopes for every role (only baselines:write is admin-gated). revoke_specific_token (auth.py:175) has no @require_roles; its only cross-user guard is has_scope('tokens:manage') — which a plain user now holds. → low-priv user can revoke admins' tokens (lockout/DoS). Fix: gate tokens:manage to admins, or restrict non-admins to revoking their own tokens regardless of scope.

H2 — POST /runs ownership bypass → arbitrary-repo CI builds. routes/runs.py:116 _validate_run_permissions: is_staff (admin/tester/contributor) short-circuits the ownership check, and repository is only regex-validated, never confirmed to exist/be-owned. A contributor can enqueue a run against any owner/repo string, creating Fork rows and making build VMs clone/build attacker-chosen repos. SSRF/resource-abuse surface.

H3 — 'testuser' short-circuit poisons real data. mod_auth/controllers.py: fetch_username_from_token returns 'testuser' whenever config['TESTING'] is truthy, and github_callback now persists github_login = 'testuser' to the real user row. TESTING is env-var controlled (run.py:109). If ever set in a prod-ish env, it corrupts the value H2's owner-check trusts. Don't couple security-relevant data to TESTING, and never write the stub to the DB.

H4 — API 500s return HTML, not JSON. middleware/error_handler.py:101 @mod_api.errorhandler(500) never fires — Flask doesn't dispatch the 500 code to blueprint handlers; the app-level handler renders 500.html. after_app_request only JSON-ifies 404/405. Any unhandled exception breaks the "always JSON" contract. The unit test calls the handler directly so it passes. Fix at app level for /api/v1 paths. (The MarshmallowValidationError/SQLAlchemyError class handlers do work — only the 500 one is dead.)

H5 — verify_schemathesis.py (938 lines) is dead. CI uses nose2 which collects test*.py — this file (verify_*) is never collected, and it imports schemathesis/hypothesis which aren't in requirements (ImportError if run). The headline "contract test" provides zero coverage. Wire it in or drop it.

H6 — N+1 in get_sample_history. routes/samples.py:318 calls get_run_timestamps → batch_get_run_data([test]) per row = 3 queries × N runs, re-querying already-loaded data. Defeats the PR's own batching claim. Batch once over the distinct tests.

H7 — POST /runs/{id}/cancel check-then-act race. routes/runs.py:481 reads status, then inserts a canceled TestProgress with no lock/re-check → duplicate cancels, or canceled written after completed (flips a finished run). "Idempotent" only holds on the already-terminal fast path. Use with_for_update() and re-check.

MEDIUM (condensed)

Rate limiter (middleware/rate_limit.py): it's a fixed window mislabeled "sliding" (allows 2× burst at boundary); per-process so N× limits under multiple gunicorn workers; unbounded global dict = memory-exhaustion DoS (eviction only >900s, no cap); remote_addr keying self-DoSes the 5/15min token endpoint if behind a proxy without ProxyFix.
Flask-Limiter declared but never imported — dead dependency (the limiter is hand-rolled).
Scope creep: ~20 unrelated dep bumps (sqlalchemy, flask, lxml, google libs, pytz==2026.2) + test-dep bumps (mypy 2.1, isort 8) bundled into a feature PR. Split them; couples the API merge to a possible lint break. Confirm pytz==2026.2/mypy==2.1.0 actually exist on PyPI.
Loose pins marshmallow>=3.21 (allows breaking 4.x), argon2-cffi>=23.0, Flask-Limiter>=3.5 in an otherwise all-== file. Pin exactly.
diff_service.read_lines does f.readlines() (whole file) with the 10MiB guard only in the route, not the service.
storage.resolve_artifact local fallback has no traversal guard (defense-in-depth gap; current callers feed DB hashes).
GCS signed URLs minted without blob.exists() → broken links / bucket-layout leak.
validate_date_range can compare naive vs aware datetimes → TypeError → 500 (HTML, per H4).
Status filter 1000-run cap silently drops matches past position 1000 and reports an inconsistent total.
/summary total_samples (planned set) never reconciles with pass/fail/missing counts (existing TestResult rows only).
DELETE token scope spec mismatch; conftest.py pytest fixture dead under nose2.

LOW/NIT

No Cache-Control: no-store on the one-time token response; timing oracle on the no-candidate auth path (no dummy verify, unlike the login path); no unique constraint on token_prefix; extension sanitization brittle (_safe_resolve saves it); progress.step hardcoded None; HSTS sent unconditionally.

Bottom line: C1 alone blocks merge — the API would misreport CI results, defeating the platform's purpose. H1–H4 are real security/contract bugs. Architecture and tests are solid; this is a fixable PR, not a rewrite. Recommend: request changes, lead with C1, require the dep bumps be split out.

Want me to (a) post this as a PR review on GitHub, (b) save it to plans/PR_1117_REVIEW.md, or (c) deep-dive/repro any specific finding (e.g. write a failing test that proves C1)?

cfsmp3

See comments

pulk17 requested review from canihavesomecoffee and thealphadollar as code owners June 4, 2026 08:06

pulk17 marked this pull request as draft June 5, 2026 13:55

pulk17 changed the title ~~[Feature] : Add REST API endpoints for access thorough AI (Pending !)~~ [Feature] : Add REST API endpoints for access through AI (Pending !) Jun 5, 2026

pulk17 force-pushed the Feature/REST-API-Endpoints branch 2 times, most recently from ab72a13 to 1441393 Compare June 8, 2026 14:39

pulk17 changed the title ~~[Feature] : Add REST API endpoints for access through AI (Pending !)~~ [Feature] : Add REST API endpoints for access through AI (WIP !) Jun 11, 2026

gaurav02081 mentioned this pull request Jun 14, 2026

feat: add sp CLI — command-line client for the platform REST API #1121

Open

Implement REST API Endpoints

094f3af

pulk17 force-pushed the Feature/REST-API-Endpoints branch from 1f3c1d5 to 094f3af Compare June 17, 2026 07:24

pulk17 changed the title ~~[Feature] : Add REST API endpoints for access through AI (WIP !)~~ [Feature] : Add REST API endpoints for access through AI Jun 17, 2026

pulk17 marked this pull request as ready for review June 21, 2026 07:47

Merge branch 'master' into Feature/REST-API-Endpoints

982b489

cfsmp3 self-requested a review June 23, 2026 19:56

cfsmp3 requested changes Jun 23, 2026

View reviewed changes

gaurav02081 mentioned this pull request Jun 23, 2026

feat: sp — command-line client for the Sample Platform REST API CCExtractor/sp_cli#1

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature] : Add REST API endpoints for access through AI #1117

[Feature] : Add REST API endpoints for access through AI #1117
pulk17 wants to merge 2 commits into
CCExtractor:masterfrom
pulk17:Feature/REST-API-Endpoints

pulk17 commented Jun 4, 2026 •

edited

Loading

Uh oh!

sonarqubecloud Bot commented Jun 21, 2026

Uh oh!

cfsmp3 commented Jun 23, 2026

Uh oh!

cfsmp3 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pulk17 commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Feature: JSON REST API for the CCExtractor CI Platform (mod_api)

Summary

Motivation

Architecture

Why this structure matters

Endpoints

Authentication (/auth/tokens)

Runs (/runs)

Samples (/samples, /runs/{id}/samples, /regression-tests)

Results (/runs/{id}/samples/{sid}/regression-tests/{rid}/outputs/{oid})

Errors & Logs (/runs/{id}/errors, /runs/{id}/logs)

System (/system)

Security

Token Authentication

Scope-Based Authorization

Role Enforcement

Rate Limiting

Security Headers

Path Traversal Protection

Input Validation

Status Derivation

Run Status

Sample Status

Dummy Row Detection

Pagination

Offset Pagination (most endpoints)

Cursor Pagination (build logs)

Error Response Format

Performance Considerations

Batch Status Derivation

Status Filtering Caveat

Result File Preloading

Diff Size Limits

Database Changes

New Table: api_token

Test Coverage

Test Organization (265 tests total)

Schemathesis (Property-Based Testing)

What the tests mock

New Files

Dependencies

How to Test Locally

Known Limitations & Future Work

Uh oh!

sonarqubecloud Bot commented Jun 21, 2026

Quality Gate passed

Uh oh!

cfsmp3 commented Jun 23, 2026

Uh oh!

cfsmp3 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pulk17 commented Jun 4, 2026 •

edited

Loading

Feature: JSON REST API for the CCExtractor CI Platform (`mod_api`)

Authentication (`/auth/tokens`)

Runs (`/runs`)

Samples (`/samples`, `/runs/{id}/samples`, `/regression-tests`)

Results (`/runs/{id}/samples/{sid}/regression-tests/{rid}/outputs/{oid}`)

Errors & Logs (`/runs/{id}/errors`, `/runs/{id}/logs`)

System (`/system`)

New Table: `api_token`